Abstract. The Yule (pure-birth) model is the simplest null model of speciation; each lineage
gives rise to a new lineage independently with the same rate $\lambda$. We investigate the expected
length of an edge chosen at random from the resulting evolutionary tree. In particular, we
compare the expected length of a randomly selected edge with the expected length of a randomly
selected pendant edge. We provide some exact formulae, and show how our results
depend slightly on whether the depth of the tree or the number of leaves is conditioned on,
and whether $\lambda$ is known or is estimated using maximum likelihood.



1. Introduction 

In evolutionary biology, the simplest model of speciation assumes that, at any moment, each 
of the then-extant lineages randomly gives rise to a new lineage at some constant rate (and 
independently of other such events). This model, and an extension, was described by Yule in 
1924 [S]. It generates a rooted binary tree for which each edge has an associated random length 
- the duration of a lineage until it speciates (i.e. gives rise to a new lineage). The Yule model is 
widely used in phylogenetic analysis; often, extinction is also allowed, but in this short note, we 
deal only with the pure-birth model. 

Although many properties of the Yule model have been extensively investigated over the years 
(e.g. [21 HI E]), in this paper we consider a question that has received less attention - namely
what can one say about the expected length of an edge selected uniformly at random from the 
set of pendant edges, or from all edges (pendant and interior)? 

We derive simple exact formulae for these quantities under two scenarios: either the number 
of leaves is given (but not the depth of the tree) or the depth of the tree is given (but not the 
number of leaves). We also evaluate these formulae when the diversification rate is replaced by its 
maximum likelihood estimate based on the depth of the tree and the number of leaves. We will 
work with expected average edge lengths (these being the same as the expected edge length of 
an edge selected uniformly at random from the appropriate class of edges - pendant or interior). 

Consider then a pure-birth Yule tree with diversification rate $\lambda$. The time that a given lineage
persists until it speciates has an exponential distribution with a mean of $1/\lambda$. We will assume
throughout that the tree starts as an initial bifurcation - that is, initially at some time $t$ in the
past, it has two lineages each of length 0, as in [5]. If there are $k$ lineages present at a given
moment, then the time until the next speciation event is also exponentially distributed,
with a mean of $1/(k\lambda)$. After time $t$ from the initial bifurcation, we produce a binary tree; the
expected number of leaves in the tree is $2e^{\lambda t}$.
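As a rough numerical illustration of the last statement, the following sketch simulates the number of lineages in a Yule process started from two lineages and compares the sample mean with $2e^{\lambda t}$. The function name `simulate_leaf_count`, the parameter values and the random seed are illustrative assumptions, not part of the original analysis.

```python
import math
import random


def simulate_leaf_count(lam, t, rng):
    """Number of lineages at time t in a Yule process started from two lineages."""
    k = 2            # lineages present just after the initial bifurcation
    elapsed = 0.0
    while True:
        # With k lineages, the waiting time to the next speciation is Exp(k * lam).
        elapsed += rng.expovariate(k * lam)
        if elapsed > t:
            return k
        k += 1


rng = random.Random(1)              # fixed seed, chosen arbitrarily (assumption)
lam, t, reps = 1.0, 1.0, 20000      # illustrative values (assumptions)
mean_leaves = sum(simulate_leaf_count(lam, t, rng) for _ in range(reps)) / reps
expected_leaves = 2 * math.exp(lam * t)
print(mean_leaves, expected_leaves)
```

With these settings the two printed numbers should be close, up to Monte Carlo error.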

Since $1/\lambda$ is the expected time that a lineage persists until it speciates, it might be expected
that the expected length of a randomly selected edge (pendant or interior) in a Yule tree would
also be $1/\lambda$. However, we will see that the expected value is either exactly or approximately equal
to one-half this value, depending on what is being conditioned on. The intuitive explanation
for the factor of $\frac{1}{2}$ is that we are considering expected edge lengths in a bifurcating (binary) tree,
rather than a linear sequence of events.

2. Expected pendant vs. interior edge lengths as function (only) of n 

In this section, we show that regardless of when we observe a tree with $n$ leaves, the expected
length of a random interior edge is $1/(2\lambda)$. For a randomly chosen pendant edge (the length of
the branch leading from a species back to where it first meets the rest of the tree), the expected
value depends on when it is observed, but it converges to $1/(2\lambda)$ as $n$ grows, and is exactly equal to
$1/(2\lambda)$ under a null assumption concerning the depth of the tree.

Consider growing the Yule tree from the initial bifurcation until it has $n + 1$ leaves. Of course,
the time ($t$) that this takes is a random variable, and we will suppose in this section that $t$ is not
known. Let $P_n$ be the expected value of the sum of the lengths of the pendant edges of the tree on
$n$ leaves up to (just before) the moment we first get $n + 1$ leaves, and let $p_n = P_n/n$ be the expected value
of the average pendant edge length. Similarly, let $I_n$ be the expected sum of the lengths of the
interior edges up to (just before) the moment we get $n + 1$ leaves, and let $i_n = I_n/(n - 2)$ be the expected
value of the average interior edge length.

Theorem 1. For all $n \geq 3$, $i_n = p_n = \frac{1}{2\lambda}$.

Proof: We have the following two recursions for $n \geq 3$:

$$I_n = I_{n-1} + \frac{P_{n-1}}{n-1} \tag{1}$$

$$P_n = P_{n-1} - \frac{P_{n-1}}{n-1} + \frac{2}{n\lambda} + \frac{n-2}{n\lambda} \tag{2}$$

Recursion (1) follows by observing that the point at which the $n$-th species arises creates a new interior
edge from one of the $n - 1$ pendant edges, hence the last term.

Recursion (2) is more complex, but it combines the following observations: As the tree grows,
from when it last has $n - 1$ leaves to when it last has $n$ leaves, one of the pendant edges is selected
uniformly at random from the $n - 1$ pendant edges and is destroyed, becoming the new interior
edge (this is the second term on the right of (2)). The remaining $n - 2$ pendant edges get longer
(this is the fourth term on the right of (2)), and two more new pendant edges arise (the third
term on the right of (2)). All these edges grow for an average of $1/(n\lambda)$ time (the expected time
till the next event), since there are at present $n$ species and we record the growth of the tree until
(just before) the next speciation event. Note that recursion (2) simplifies to:

$$P_n = P_{n-1}\left(1 - \frac{1}{n-1}\right) + \frac{1}{\lambda}.$$

This equation, combined with the initial condition $P_2 = \frac{1}{2\lambda} + \frac{1}{2\lambda} = \frac{1}{\lambda}$ (since the expected
time of the transition from two to three leaves is $1/(2\lambda)$), has the closed-form solution $P_n = \frac{n}{2\lambda}$.
From this we obtain the expected average length of a pendant edge: $p_n = \frac{1}{n} \cdot P_n = \frac{1}{2\lambda}$.
Using this in recursion (1), along with the initial condition $I_2 = 0$ and the fact that there are
$n - 2$ interior edges, also gives us the expected average length of an interior edge: $i_n = \frac{1}{n-2} \cdot I_n = \frac{1}{2\lambda}$.
In particular, $i_n = p_n$ for $n \geq 3$. This completes the proof of Theorem 1. □
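The closed forms in the proof can be checked by iterating the two recursions in exact rational arithmetic. The sketch below is only a sanity check under the stated initial conditions $I_2 = 0$ and $P_2 = 1/\lambda$; the function name `expected_sums` and the test rate are illustrative assumptions.

```python
from fractions import Fraction


def expected_sums(n_max, lam=Fraction(1)):
    """Iterate recursions (1) and (2); return {n: (I_n, P_n)} for 2 <= n <= n_max."""
    I, P = Fraction(0), 1 / lam              # initial conditions I_2 = 0, P_2 = 1/lam
    sums = {2: (I, P)}
    for n in range(3, n_max + 1):
        lost = P / (n - 1)                   # pendant edge that becomes an interior edge
        I = I + lost                                           # recursion (1)
        P = P - lost + 2 / (n * lam) + (n - 2) / (n * lam)     # recursion (2)
        sums[n] = (I, P)
    return sums


lam = Fraction(3, 2)                         # arbitrary rational rate (assumption)
for n, (I_n, P_n) in expected_sums(50, lam).items():
    assert P_n / n == 1 / (2 * lam)          # p_n = 1/(2*lam)
    if n >= 3:
        assert I_n / (n - 2) == 1 / (2 * lam)    # i_n = 1/(2*lam)
```

Using `Fraction` rather than floating point means the equalities are tested exactly, not up to rounding.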

2.1. Remarks. Note that the identity in Theorem 1 is under the 'late sampling' scenario where
the tree is observed just before the time of the next speciation event. But if one has $n$ leaves,
and one records the $n$ pendant edge lengths at the 'earliest possible' time, namely when the $n$-th
species first arises (rather than just before the $(n + 1)$-st species appears), then the 'corrected'
average expected length of a pendant edge will be $\frac{1}{2\lambda} - \frac{1}{n\lambda}$.
Notice that if one records the pendant edges exactly half-way between these two expected
times, this would give $\frac{1}{2\lambda} - \frac{1}{2n\lambda}$.
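The difference between 'earliest possible' and 'late' sampling can be illustrated by simulation. The following sketch grows a Yule tree to $n$ leaves and records the average pendant edge length at both moments; the function name `pendant_averages`, the parameter values and the seed are illustrative assumptions.

```python
import random


def pendant_averages(n, lam, rng):
    """Grow a Yule tree to n leaves (n >= 2); return the average pendant edge length
    (early, late): when the n-th leaf first arises, and just before leaf n + 1."""
    pendant = [0.0, 0.0]                    # two lineages of length 0 at the root
    while True:
        k = len(pendant)
        wait = rng.expovariate(k * lam)     # time until the next speciation event
        if k == n:
            early = sum(pendant) / n
            return early, early + wait      # every pendant edge grows by `wait`
        pendant = [x + wait for x in pendant]
        pendant.pop(rng.randrange(k))       # this pendant edge becomes interior
        pendant += [0.0, 0.0]               # two new pendant edges of length 0


rng = random.Random(7)                      # fixed seed, chosen arbitrarily (assumption)
n, lam, reps = 10, 1.0, 20000               # illustrative values (assumptions)
samples = [pendant_averages(n, lam, rng) for _ in range(reps)]
mean_early = sum(e for e, _ in samples) / reps
mean_late = sum(l for _, l in samples) / reps
print(mean_early, 1 / (2 * lam) - 1 / (n * lam))   # early sampling
print(mean_late, 1 / (2 * lam))                    # late sampling
```

Up to Monte Carlo error, the first pair should agree with $\frac{1}{2\lambda} - \frac{1}{n\lambda}$ and the second with $\frac{1}{2\lambda}$.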

However, if we observe that there are $n$ leaves in a tree for which $t$ is unknown and we ask what
is the expected time that there have been $n$ (rather than $n - 1$) leaves, then this expected time is
$1/(n\lambda)$ rather than $1/(2n\lambda)$, which restores our expected pendant edge estimate back to $1/(2\lambda)$. This
follows from a result by Gernhard (pQ, Theorem 5.2, case $\mu = 0$ with $k = n - 1$), who studied,
more generally, the distribution of times between speciation events in a birth-death tree when
the age of the tree is unknown and so is assumed to have an (improper) uniform prior (see also
[2]). Thus, in the case where $t$ is unknown, we may assume that the expected average pendant
edge length is the same as for interior edges, namely $1/(2\lambda)$.

Notice that in any case, the possible 'corrections' all converge to 0 as $n$ increases. This simple
observation that leaf edges have the same expected length as interior edges is behind the otherwise somewhat
non-intuitive assumption made by Nee [3J [5] that one can posit a speciation event at the present
when calculating diversification rates, and the fact that Pybus' gamma [Sj can be estimated using
all waiting times, even the most recent [4].

3. Expected average edge length in a Yule tree of given size and depth 

Let $TL_n(t)$ be the (random variable) sum of the branch lengths in a Yule tree $T$ that has depth
$t$ and $n$ leaves, and let $L_n(t)$ be the expected value of $TL_n(t)$. Thus $l_n(t) := L_n(t)/(2n - 2)$ is
the expected average branch length (since $T$ has $2n - 2$ branches).

Theorem 2. Conditional on $n$, $t$ and $\lambda$, the expected value of $TL$ is given by:

$$L_n(t) = 2t + \frac{n-2}{\lambda}\bigl(1 - y(\lambda t)\bigr),$$

where $y(x) := \frac{x e^{-x}}{1 - e^{-x}}$ is a strictly decreasing function for $x \in (0, \infty)$ with $y(0^+) = 1$, $y(\infty) = 0$.
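The formula in Theorem 2 is straightforward to evaluate numerically. The sketch below implements $y$ and $L_n(t)$ and checks the stated properties of $y$ on a grid; the function names and the sample arguments are illustrative assumptions.

```python
import math


def y(x):
    """y(x) = x * exp(-x) / (1 - exp(-x)), evaluated as x / (exp(x) - 1) for x > 0."""
    return x / math.expm1(x)


def expected_total_length(n, t, lam):
    """L_n(t) from Theorem 2: expected total branch length given n, t and lam."""
    return 2 * t + (n - 2) / lam * (1 - y(lam * t))


xs = [0.01 * i for i in range(1, 2001)]               # grid on (0, 20]
assert all(y(a) > y(b) for a, b in zip(xs, xs[1:]))   # strictly decreasing on the grid
print(y(1e-9), y(50.0))                               # near 1 and near 0 respectively
print(expected_total_length(n=20, t=2.0, lam=1.0))    # illustrative arguments (assumptions)
```

Writing $y(x)$ as $x/(e^x - 1)$ and using `math.expm1` avoids loss of precision for small $x$; for very large $x$ (beyond roughly 700) `expm1` overflows, so this form is intended for moderate values of $\lambda t$.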

Remarks. Notice that (by analogy with the earlier section) we can write the expected average
edge length as $l_n(t) = \frac{1}{2\lambda} + \delta$, where the 'correction' term $\delta = \delta(\lambda, t)$ is given by

$$\delta = t \cdot \left(\frac{1}{n-1} - \frac{1}{(2n-2)\lambda t} - \frac{n-2}{2n-2} \cdot \frac{y(\lambda t)}{\lambda t}\right) \approx -\frac{y(\lambda t)}{2\lambda},$$

where the approximation is for $n$ large. Notice that $y(\lambda t) \to 0$ as $\lambda t \to \infty$. Notice also that we
can write $L_n(t) = t \cdot (2 + (n - 2) z(\lambda t))$, where $z(x) := \frac{1 - y(x)}{x} = 1 + \frac{1}{x} - \frac{1}{1 - e^{-x}}$. In particular,
we can write $L_n(t)$ as a function of the form $tH(\lambda t)$.
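To see how quickly the large-$n$ approximation for the correction term takes hold, one can compare the exact value of $\delta = l_n(t) - \frac{1}{2\lambda}$ with $-y(\lambda t)/(2\lambda)$ for increasing $n$. This is a small numerical sketch; all function names and parameter values are illustrative assumptions.

```python
import math


def y(x):
    return x / math.expm1(x)          # equals x * exp(-x) / (1 - exp(-x)) for x > 0


def z(x):
    return (1 - y(x)) / x


def delta_exact(n, t, lam):
    """Correction term delta = l_n(t) - 1/(2*lam), with l_n(t) = L_n(t) / (2n - 2)."""
    L = 2 * t + (n - 2) / lam * (1 - y(lam * t))
    return L / (2 * n - 2) - 1 / (2 * lam)


def delta_large_n(t, lam):
    """Large-n approximation -y(lam * t) / (2 * lam)."""
    return -y(lam * t) / (2 * lam)


t, lam = 1.0, 2.0                     # illustrative values (assumptions)
for n in (10, 100, 10000):
    print(n, delta_exact(n, t, lam), delta_large_n(t, lam))

# The alternative form L_n(t) = t * (2 + (n - 2) * z(lam * t)) gives the same value.
n = 25
L_direct = 2 * t + (n - 2) / lam * (1 - y(lam * t))
L_via_z = t * (2 + (n - 2) * z(lam * t))
print(L_direct, L_via_z)
```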




FIGURE 1. Speciation times in a Yule tree of depth $t$. The values $t = t_1 > t_2 > t_3 > \cdots > t_4 > t_5 = 0$ measure time from the present.



Proof of Theorem 2: Let $t_2, \ldots, t_{n-1}$ be the (decreasing) times of the speciation events after
an initial bifurcation at time $t = t_1$ in the past - this follows the notation of [7], but we use
$n$ here for the number of leaves, not $s$, and we write $t$ for $t_1$ (see Fig. 1 for an example with
$n = 5$). The density of this vector of $t$-values conditional on $t$, $\lambda$ and $n$ is given by Eqn. (3) of
[7] in the special case where $\rho = 1$, $\mu = 0$ (and so in the notation of that paper $p_1(t) = e^{-\lambda t}$ and
$v_{t_1} = 1 - e^{-\lambda t}$) and is given as follows:

$$f(\mathbf{t} \mid n, t, \lambda) = \frac{(n-2)!\,\lambda^{n-2} \exp\Bigl(-\lambda \sum_{j=2}^{n-1} t_j\Bigr)}{(1 - e^{-\lambda t})^{n-2}}. \tag{3}$$

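Note that this is also consistent with Eqn. (5) of [5]. Now:

$$L_n(t) = \int_{\mathbf{t}} \Bigl(2t + \sum_{j=2}^{n-1} t_j\Bigr) \cdot f(\mathbf{t} \mid n, t, \lambda)\, d\mathbf{t}, \tag{4}$$

since $TL_n(t) = 2t + \sum_{j=2}^{n-1} t_j$, and where integration is over all tuples $(t_2, \ldots, t_{n-1})$ for which
$t > t_2 > t_3 > \cdots > t_{n-1} > 0$. Now we can split up (4) as follows:

$$L_n(t) = 2t + \int_{\mathbf{t}} \Bigl(\sum_{j=2}^{n-1} t_j\Bigr) \cdot f(\mathbf{t} \mid n, t, \lambda)\, d\mathbf{t}. \tag{5}$$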

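From (3), the second term on the right-hand side of (5) is:

$$\frac{(n-2)!\,\lambda^{n-2}}{(1 - e^{-\lambda t})^{n-2}} \int_{\mathbf{t}} \Bigl(\sum_{j=2}^{n-1} t_j\Bigr) \cdot \exp\Bigl(-\lambda \sum_{j=2}^{n-1} t_j\Bigr)\, d\mathbf{t}. \tag{6}$$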

Now we can exploit the fact that the term inside the integral sign of (6) can be written as:

$$\Bigl(\sum_{j=2}^{n-1} t_j\Bigr) \cdot \exp\Bigl(-\lambda \sum_{j=2}^{n-1} t_j\Bigr) = -\frac{d}{d\lambda} \exp\Bigl(-\lambda \sum_{j=2}^{n-1} t_j\Bigr), \tag{7}$$

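and so, applying the Leibniz integral rule, the expression in (6) can be written as:

$$\frac{(n-2)!\,\lambda^{n-2}}{(1 - e^{-\lambda t})^{n-2}} \cdot \left(-\frac{d}{d\lambda} \int_{\mathbf{t}} \exp\Bigl(-\lambda \sum_{j=2}^{n-1} t_j\Bigr)\, d\mathbf{t}\right).$$

Now,

$$\int_{\mathbf{t}} \exp\Bigl(-\lambda \sum_{j=2}^{n-1} t_j\Bigr)\, d\mathbf{t} = \frac{(1 - e^{-\lambda t})^{n-2}}{(n-2)!\,\lambda^{n-2}} \tag{8}$$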

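(by applying $\int_{\mathbf{t}} f(\mathbf{t} \mid n, t, \lambda)\, d\mathbf{t} = 1$ to (3)). Thus, combining (6), (7) and (8) into (5) gives:

$$L_n(t) = 2t + \frac{\lambda^{n-2}}{(1 - e^{-\lambda t})^{n-2}} \cdot \left(-\frac{d}{d\lambda} \left[\frac{(1 - e^{-\lambda t})^{n-2}}{\lambda^{n-2}}\right]\right),$$

and the result now follows by routine calculus. □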
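For readers who wish to see the final calculus step written out, one way to carry it out (a worked sketch using the quotient rule, with $y(x) = x e^{-x}/(1 - e^{-x})$ as in Theorem 2) is:

```latex
\begin{align*}
-\frac{d}{d\lambda}\left[\frac{(1-e^{-\lambda t})^{n-2}}{\lambda^{n-2}}\right]
 &= \frac{(n-2)(1-e^{-\lambda t})^{n-2}}{\lambda^{n-1}}
  - \frac{(n-2)\,t\,e^{-\lambda t}(1-e^{-\lambda t})^{n-3}}{\lambda^{n-2}}, \\[4pt]
\frac{\lambda^{n-2}}{(1-e^{-\lambda t})^{n-2}} \cdot
\left(-\frac{d}{d\lambda}\left[\frac{(1-e^{-\lambda t})^{n-2}}{\lambda^{n-2}}\right]\right)
 &= \frac{n-2}{\lambda} - \frac{(n-2)\,t\,e^{-\lambda t}}{1-e^{-\lambda t}}
  = \frac{n-2}{\lambda}\bigl(1 - y(\lambda t)\bigr).
\end{align*}
```

Adding $2t$ to the last expression recovers the formula for $L_n(t)$ stated in Theorem 2.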

3.1. Estimation of $\lambda$ from $n$, $t$. Given (just) $n$ and $t$, the maximum likelihood estimate of $\lambda$,
which we denote $\lambda_{ML}$, is given by:

$$\lambda_{ML} = \ln\left(\frac{n}{2}\right) / t. \tag{9}$$

Note that $n$ is divided by 2 in this formula since we initially start with two species, and after time $t$,
we observe $n$ extant species. Eqn. (9) can be formally verified by differentiating Eqn. (4) in [5]
with respect to $\lambda$, and solving for $\lambda$ in the resulting expression. With this in hand, we can now
state a consequence of Theorem 2.

Corollary 3. If we take $\lambda = \lambda_{ML}$ in the expression for $\mathbb{E}[TL]$ given by Theorem 2, then:

$$\text{(i)}\ \ L_n(t) = \frac{(n-2)t}{\ln(n/2)}, \qquad \text{(ii)}\ \ \lambda_{ML} = \frac{n-2}{L_n(t)}.$$

Proof: We have $y(\lambda_{ML} t) = y(\ln(n/2)) = \frac{2\ln(n/2)}{n-2}$. Thus:

$$L_n(t) = 2t + \frac{n-2}{\lambda_{ML}} \left(1 - \frac{2\ln(n/2)}{n-2}\right) = 2t + \frac{(n-2)t}{\ln(n/2)} - 2t = \frac{(n-2)t}{\ln(n/2)},$$

where the second equality uses (9). This gives Part (i); Part (ii) is an immediate consequence,
again using (9).

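3.2. Remarks. Notice that Corollary 3(i) implies that for $\lambda = \lambda_{ML}$, we can express $l_n(t)$ in
the familiar form of $\frac{1}{2\lambda}$ plus a 'correction term' that vanishes as $n$ grows. More precisely, for
$\lambda = \lambda_{ML}$, we have:

$$l_n(t) = \frac{1}{2\lambda}\left(1 - \frac{1}{n-1}\right) = \frac{1}{2\lambda} - \frac{1}{2(n-1)\lambda}.$$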

Nee [5] shows that, given a tree with branch lengths (and thereby $n$, $t$ and the actual value of
$TL$), the maximum likelihood estimator of $\lambda$, which he denotes as $\hat{\lambda}$, is given by Eqn. 6 of [5] as:

$$\hat{\lambda} = \frac{n-2}{TL}.$$

Comparing this with Corollary 3(ii), we see a nice concordance: the ML estimate of $\lambda$ based on
just $n$ and $t$ (i.e. $\lambda_{ML}$) is exactly the same value as the ML estimate of $\lambda$ (i.e. $\hat{\lambda}$) for an actual
tree whose total length $TL$ is equal to what it is expected to be under the Yule model for given
$n$ and $t$ and $\lambda = \lambda_{ML}$.
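The concordance can be checked numerically: substituting $\lambda_{ML}$ into the expression of Theorem 2 and then applying the estimator $(n-2)/TL$ to the resulting expected total length should return $\lambda_{ML}$. The sketch below does this for one arbitrary choice of $n$ and $t$; the function names and values are illustrative assumptions.

```python
import math


def y(x):
    return x / math.expm1(x)          # equals x * exp(-x) / (1 - exp(-x)) for x > 0


def expected_total_length(n, t, lam):
    """L_n(t) from Theorem 2."""
    return 2 * t + (n - 2) / lam * (1 - y(lam * t))


def lambda_ml(n, t):
    """Eqn. (9): ML estimate of lam from n and t alone (requires n > 2 and t > 0)."""
    return math.log(n / 2) / t


n, t = 40, 3.0                            # illustrative values (assumptions)
lam_ml = lambda_ml(n, t)
L = expected_total_length(n, t, lam_ml)
print(L, (n - 2) * t / math.log(n / 2))   # Corollary 3(i): the two values agree
print((n - 2) / L, lam_ml)                # Corollary 3(ii): (n - 2) / L_n(t) recovers lam_ml
```

The restriction $n > 2$ matters in code as well as in the formula: for $n = 2$ the estimate is 0 and the division by $\lambda$ is undefined.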



4. Expected pendant vs. interior edge lengths as function (only) of t 

Let $I = I(t)$ be the expected sum of the interior edge lengths of a Yule tree that has grown
for time $t$. In contrast to the previous section, the number of leaves of this tree will be regarded
as an unconstrained random variable. Similarly, let $P = P(t)$ and $L = L(t)$ be, respectively, the
expected sum of the pendant (and of the total) edge lengths of a Yule tree that has grown for
time $t$. Thus,

$$I(0) = P(0) = L(0) = 0, \quad \text{and} \quad L(t) = I(t) + P(t).$$

Theorem 4.

$$I(t) = \frac{1}{\lambda}\left(e^{\lambda t} + e^{-\lambda t} - 2\right) \quad \text{and} \quad P(t) = \frac{1}{\lambda}\left(e^{\lambda t} - e^{-\lambda t}\right).$$

Thus, if $p(t)$ and $i(t)$ are the expected average lengths of the pendant and interior edges of a Yule
tree of depth $t$, then the ratio $p(t)/i(t)$ converges to 1 exponentially fast with increasing $t$.
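The two closed forms can be compared with a direct simulation of the process. The following sketch grows Yule trees for a fixed time, tracks the pendant and interior edge sums, and compares the sample means with the formulae for $I(t)$ and $P(t)$; the function name `simulate_edge_sums`, the parameter values and the seed are illustrative assumptions.

```python
import math
import random


def simulate_edge_sums(lam, t, rng):
    """Grow a Yule tree for time t; return (interior sum, pendant sum)."""
    pendant = [0.0, 0.0]       # two lineages of length 0 at the initial bifurcation
    interior = 0.0
    elapsed = 0.0
    while True:
        k = len(pendant)
        wait = rng.expovariate(k * lam)
        if elapsed + wait > t:
            remaining = t - elapsed            # all k pendant edges grow until time t
            return interior, sum(pendant) + k * remaining
        elapsed += wait
        pendant = [x + wait for x in pendant]
        interior += pendant.pop(rng.randrange(k))   # a random pendant edge becomes interior
        pendant += [0.0, 0.0]                       # two new pendant edges of length 0


rng = random.Random(11)                 # fixed seed, chosen arbitrarily (assumption)
lam, t, reps = 1.0, 1.5, 20000          # illustrative values (assumptions)
results = [simulate_edge_sums(lam, t, rng) for _ in range(reps)]
mean_I = sum(i for i, _ in results) / reps
mean_P = sum(p for _, p in results) / reps
I_exact = (math.exp(lam * t) + math.exp(-lam * t) - 2) / lam
P_exact = (math.exp(lam * t) - math.exp(-lam * t)) / lam
print(mean_I, I_exact)
print(mean_P, P_exact)
```

Each pair of printed values should agree up to Monte Carlo error.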

Proof: From Theorem 2, $L_n(t)$ is a linear function of $n$. So, if we regard $n$ as a random
variable, rather than a given value, then $L(t)$ is the expected value of $L_n(t)$ with respect to the
distribution on $n$. Thus, since $\mathbb{E}[n] = 2e^{\lambda t}$, Theorem 2 gives:

$$L(t) = 2t + \frac{2e^{\lambda t} - 2}{\lambda}\bigl(1 - y(\lambda t)\bigr),$$

which simplifies to:

$$L(t) = \frac{2}{\lambda}\left(e^{\lambda t} - 1\right). \tag{10}$$

Now, if the Yule tree has $k$ species at time $t$, then the expected sum of interior edge lengths at
time $t + \delta$ is:

$$I(t) + \delta\lambda k \cdot \frac{P(t)}{k} + o(\delta) = I(t) + \delta\lambda P(t) + o(\delta), \tag{11}$$

since $I(t)$ increases precisely if a speciation event occurs in the interval $(t, t + \delta)$ (which has
probability $\delta\lambda k + o(\delta)$), in which case $I(t)$ increases by the average length of the pendant edges ($+\, o(\delta)$),
since one of the $k$ pendant edges, selected uniformly at random, becomes a new interior
edge. Notice that the right-hand side of (11) is, fortunately, independent of $k$, and so:

$$\frac{dI}{dt} = \lambda P. \tag{12}$$

Writing $P(t) = L(t) - I(t)$ in (12) and combining this with (10) gives:

$$\frac{dI(t)}{dt} + \lambda I(t) = 2\left(e^{\lambda t} - 1\right). \tag{13}$$

This is a standard first-order linear differential equation, for which the solution, subject to the
boundary condition $I(0) = 0$, is the expression for $I(t)$ in Theorem 4. The remainder of the
proof now follows easily. □
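As a quick check on the last step, one can verify numerically that the stated expression for $I(t)$ satisfies the differential equations (12) and (13), and that $I(t) + P(t)$ equals $L(t)$ from (10). The sketch below uses a central finite difference for the derivative; the function names, the rate and the step size are illustrative assumptions.

```python
import math


def interior_sum(t, lam):
    return (math.exp(lam * t) + math.exp(-lam * t) - 2) / lam


def pendant_sum(t, lam):
    return (math.exp(lam * t) - math.exp(-lam * t)) / lam


def total_sum(t, lam):
    return 2 * (math.exp(lam * t) - 1) / lam


lam, h = 0.7, 1e-5                       # illustrative rate and step size (assumptions)
for t in (0.5, 1.0, 4.0):
    # Central finite-difference approximation to dI/dt.
    dI_dt = (interior_sum(t + h, lam) - interior_sum(t - h, lam)) / (2 * h)
    assert math.isclose(dI_dt, lam * pendant_sum(t, lam), rel_tol=1e-6)        # Eqn. (12)
    assert math.isclose(dI_dt + lam * interior_sum(t, lam),
                        2 * (math.exp(lam * t) - 1), rel_tol=1e-6)             # Eqn. (13)
    assert math.isclose(interior_sum(t, lam) + pendant_sum(t, lam),
                        total_sum(t, lam))                                     # L = I + P
assert interior_sum(0.0, lam) == 0.0     # boundary condition I(0) = 0
```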

4.1. Remarks. If $n$ takes its expected value $2e^{\lambda t}$, then Theorem 4 shows that $i(t)$ and $p(t)$ are
just $\frac{1}{2\lambda}$ plus 'correction terms' that converge rapidly to 0 with increasing $t$. In a subsequent
paper we will describe the analysis of branch lengths for the Yule model when both $n$ and $t$ are
conditioned on, and when extinction is considered. The analysis in these cases is more complex,
and beyond the scope of this short note.